Comparison of Support Vector Machine and Artificial Neural Network Systems for Drug/Nondrug Classification
نویسندگان
چکیده
Support vector machine (SVM) and artificial neural network (ANN) systems were applied to a drug/nondrug classification problem as an example of binary decision problems in early-phase virtual compound filtering and screening. The results indicate that solutions obtained by SVM training seem to be more robust with a smaller standard error compared to ANN training. Generally, the SVM classifier yielded slightly higher prediction accuracy than ANN, irrespective of the type of descriptors used for molecule encoding, the size of the training data sets, and the algorithm employed for neural network training. The performance was compared using various different descriptor sets and descriptor combinations based on the 120 standard Ghose-Crippen fragment descriptors, a wide range of 180 different properties and physicochemical descriptors from the Molecular Operating Environment (MOE) package, and 225 topological pharmacophore (CATS) descriptors. For the complete set of 525 descriptors cross-validated classification by SVM yielded 82% correct predictions (Matthews cc = 0.63), whereas ANN reached 80% correct predictions (Matthews cc = 0.58). Although SVM outperformed the ANN classifiers with regard to overall prediction accuracy, both methods were shown to complement each other, as the sets of true positives, false positives (overprediction), true negatives, and false negatives (underprediction) produced by the two classifiers were not identical. The theory of SVM and ANN training is briefly reviewed.
منابع مشابه
Prediction of true critical temperature and pressure of binary hydrocarbon mixtures: A Comparison between the artificial neural networks and the support vector machine
Two main objectives have been considered in this paper: providing a good model to predict the critical temperature and pressure of binary hydrocarbon mixtures, and comparing the efficiency of the artificial neural network algorithms and the support vector regression as two commonly used soft computing methods. In order to have a fair comparison and to achieve the highest efficiency, a comprehen...
متن کاملAssessment the Performance of Support Vector Machine and Artificial Neural Network Systems for Regional Flood Frequency Analysis (A Case Study: Namak Lake Watershed)
Flood discharge estimation with different return periods is one of important factors for water structures design and installation. On the other hand, a lot of rivers existing in Iran watersheds have no complete and accurate hydrometric data. In these cases, one of the suitable solutions to estimate peak discharges with different return periods is the regional flood analysis. In this research, 5...
متن کاملComparison of classic regression methods with neural network and support vector machine in classifying groundwater resources
In the present era, classification of data is one of the most important issues in various sciences in order to detect and predict events. In statistics, the traditional view of these classifications will be based on classic methods and statistical models such as logistic regression. In the present era, known as the era of explosion of information, in most cases, we are faced with data that c...
متن کاملBubble Pressure Prediction of Reservoir Fluids using Artificial Neural Network and Support Vector Machine
Bubble point pressure is an important parameter in equilibrium calculations of reservoir fluids and having other applications in reservoir engineering. In this work, an artificial neural network (ANN) and a least square support vector machine (LS-SVM) have been used to predict the bubble point pressure of reservoir fluids. Also, the accuracy of the models have been compared to two-equation stat...
متن کاملApplication of Artificial Neural Networks in a Two-step Classification for Acute Lymphocytic Leukemia Diagnosis by Blood Lamella Images
Introduction: This study aimed to present a system based on intelligent models that can enhance the accuracy of diagnostic systems for acute leukemia. The three parts including preprocessing, feature extraction, and classification network are considered as associated series of actions. Therefore, any dysfunction or poor accuracy in each part might lead in general dysfunction of...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- Journal of chemical information and computer sciences
دوره 43 6 شماره
صفحات -
تاریخ انتشار 2003